WO 98/16552 - 36 - PCT/EP97/04744



SEQUENCE LISTING 



(1) GENERAL INFORMATION:

(i) APPLICANT:

(A) NAME: Max-Planck-Gesellschaft zur Foerderung der
Wissenschaften e.V. Berlin

(B) STREET: Hofgartenstr. 2

(C) CITY: Muenchen

(E) COUNTRY: Germany

(F) POSTAL CODE (ZIP): 80539

(ii) TITLE OF INVENTION: Helicobacter pylori live vaccine
(iii) NUMBER OF SEQUENCES: 6

(iv) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Floppy disk

(B) COMPUTER: IBM PC compatible

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30
(EPO)

(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1557 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: both

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Helicobacter pylori

(vii) IMMEDIATE SOURCE:

(B) CLONE: alpB

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1554

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

ATG ACA CAA TCT CAA AAA GTA AGA TTC TTA GCC CCT TTA AGC CTA GCG
48

Met Thr Gln Ser Gln Lys Val Arg Phe Leu Ala Pro Leu Ser Leu Ala

1 5 10 15

TTA AGC TTG AGC TTC AAT CCA GTG GGC GCT GAA QM^ GAT GGG GGC TTT 
Phe Asn Pro






Val Gly 
25 



TAT GAA TTA GGT CAG

Tyr Glu Leu Gly Gln
40 



Ala Glu Glu Asp 

GTG GTC CAA CAA 

Val Val Gln Gln
45 

GCC GGC TTG TTA 

Ala Gly Leu Leu 
60 




ACA ACA 

240 
Thr Thr 
65 

GCC GGG 

288 
Ala Gly 



GAT TTG 

336 
Asp Leu 



ACT AAT 

384 
Thr Asn 



GCT ACT 

432 
Ala Thr 
130 

AGA AAA 

480 
Arg Lys 
145 

AAT ATC 

528 
Asn Ile



CAG CTC 

576 
Gln Leu



ACT TTG 
Thr Leu 

TAT CCC ACT TTG AAC 

Tyr Pro Thr Leu Asn 
100 

AGT GGT AGT AGT AGT AGT GGT 
Ser Ser Ser 



Ser Gly 
115 

ACT AGC 
Thr Ser 



AAT AAG 
Asn Lys 



CCT 

Pro 
135 



Ser Gly 
120 

TGT TTC 
Cys Phe 
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Gly Gly Phe 
30 

GTG AAA AAC 
Val Lys Asn 

AAC TCT ACC 
Asn Ser Thr 

GGC AAT GTC 

Gly Asn Val 
80 

AAT TTG ATT 

Asn Leu Ile
95 

ACA CAA TGT GGC ACT 

Thr Gln Cys Gly Thr
110 

GCG GCC ACA GCC GCT 

Ala Ala Thr Ala Ala 
125 

CAA Gci^AAC CTG 
Gln Gly Asn Leu



GAT CTT TAT 
Asp Leu Tyr 



ATG GTT GAC TCT ATC 

Met Val Asp Ser Ile
150 

TTT CAA GGC AAC AAC 

Phe Gln Gly Asn Asn
165 



AAA ACT TTG AGT 

Lys Thr Leu Ser 
155 

AAC ACC ACG AGC 

Asn Thr Thr Ser 
170 



CAA AAC ATC AGC AAG

Gln Asn Ile Ser Lys

160 



CAA AAT 
Gln Asn



CTC TCC AAC

Leu Ser Asn
175 



AGT GAG 

Ser Glu 
180 



CTT AAC 
Leu Asn 



ACC GCT AGC 

Thr Ala Ser 
185 



GTT TAT 
Val Tyr 



TTG ACT TAG AAC 

Leu Thr Tyr Met Asn
190 
TCG TTC TTA AAC

624
Ser Phe Leu Asn
195

ACT AAT CAA GCT

672
Thr Asn Gln Ala
210

ATC CTA AAG CAA 
720 

Ile Leu Lys Gln
225 

GCT GCC GCA GCG 
768 

Ala Ala Ala Ala 



TCC GCT AAC GCC 
816 

Ser Ala Asn Ala 
260 

GTG CAA AAT ATC 
864 

Val Gln Asn Ile
275 

AAC GCT AAC ATC 
912 

Asn Ala Asn Ile
290 

AAT ATT GAT CAA 
960 

Asn Ile Asp Gln
305 

ACT TTG GCT AAA 

1008 
Thr Leu Ala Lys 



TGG CTT GGG AAT 
1056 

Trp Leu Gly Asn 
340 

AAC GGG TTT ATC 
1104 

Asn Gly Phe Ile
355 



GCC AAT AAC CAA 

Ala Asn Asn Gln
200 

TAT GGA AAT GGG 

Tyr Gly Asn Gly 
215 

GCT TCA ATC ACT

Ala Ser Ile Thr
230

TTT TTG GAT GCC

Phe Leu Asp Ala
245

GGG AAC GAT TTG 

Gly Asn Asp Leu 

GTC AAT AAT TCT 

Val Asn Asn Ser 
280 

AGC AAT TCA ACA

Ser Asn Ser Thr 
295 

GCG CGA TCT ACC 

Ala Arg Ser Thr 
310 

GTT AGC GCT TTG 

Val Ser Ala Leu 
325 

TTT GCC GCC GGT 
Phe Ala Ala Gly 

ACT AAA ATC GGT 

Thr Lys Ile Gly
360 



GCG GGT GGG ATT 
Ala Gly Gly Ile

GTT ACC GCT CAA 

Val Thr Ala Gln
220 

ATG GGG CCA AGC 

Met Gly Pro Ser 
235 

GCT TTA GCG CAA 

Ala Leu Ala Gln
250 

AGC GCT AAG GAA

Ser Ala Lys Glu
265

CAA AAC GCT TTA
Gln Asn Ala Leu

GGC TAT CAA GTG
Gly Tyr Gln Val

300

CAA CTA TTA AAC 

Gln Leu Leu Asn
315 

AAT AAC GAG CTT 

Asn Asn Glu Leu 
330 

AAC AGC TCT CAA 

Asn Ser Ser Gln
345 

TAC AAG CAA TTC 
Tyr Lys Gln Phe



TTT CAA AAC AAC 

Phe Gln Asn Asn
205

CAA ATC GCT TAT
Gln Ile Ala Tyr

GGT GAT AGC GGT 

Gly Asp Ser Gly 
240 

CAT GTT TTC AAC 

His Val Phe Asn 
255 

TTC ACT AGC TTG 

Phe Thr Ser Leu 
270 

ACG CTA GCC AAC 

Thr Leu Ala Asn 
285 

AGC TAT GGC GGG 
Ser Tyr Gly Gly 

AAC ACC ACA AAC 

Asn Thr Thr Asn
320

AAA GCT AAC CCA

Lys Ala Asn Pro
335

GTG AAT GCG TTT

Val Asn Ala Phe
350

TTT GGG GAA AAC 

Phe Gly Glu Asn 
365 
AAT GTG GGC
1152

Lys Asn Val Gly
370

GGC GTG GGT AAT

1200
Gly Val Gly Asn
385

GGG GTG GGG AC 

1248 
Gly Val Gly Thr 




TTA CGC TAC TAC GGC TTC TTC AGC
Tyr Gly 



Leu Arg Tyr 
375 



Phe Phe Ser 
380 



GGC CCT ACT TAC AAT CAA GTC AAT
Tyr Asn 



Gly Pro Thr 
390 

GAT GTG CTT 
Val Leu 



Gln Val Asn
395 



TAC AAT GTG TTT AGC
Tyr Asn 



AGT AGG AGT 

1296 
Ser Arg Ser 



GAT ACT TAC

1344 
Asp Thr Tyr 
435 

CCT ACA GCG 

1392 
Pro Thr Ala 
450 

AAC TTT GGT 

1440
Asn Phe Gly 
465 

ATA GAA ATC 

1488 
Ile Glu Ile



CTT 

Leu 
420 

ATC 

Ile

ACG 
Thr 



4 

AAT GCG GGC TTC TTT

Asn Ala Gly Phe Phe

425

AGC ACG CTA AGA AAC

Ser Thr Leu Arg Asn
440

AAA TTC CAA TTC CTC
Phe 



Val Phe Ser 
410 
TAT AAC GGC GCG
Tyr Asn Gly Ala 

TTG CTC ACT TAT 

Leu Leu Thr Tyr 
400 

CGC TCT TTT GGT 
Arg Ser 



GGG GGG ATC CAA CTC 

Gly Gly Ile Gln Leu

430 

AGC TCT CAG CTT GCG

Ser Ser Gln Leu Ala
445 



Lys Phe Gln
455 



ATC TTG AAA AAA

Ile Leu Lys Lys
470 

GGT GTG CAA ATC 
Gly 



Val Gln Ile
485 



Leu

GAC TTG 
Asp Leu 

CCT ACG 
Pro Thr 



TTT GAT GTG 

Phe Asp Val 
460 



GGC TTA 
Gly Leu 



Phe Gly 
415 

GCA GGG 
Ala Gly 

AGC AGA 
Ser Arg 

CGC ATG 
Arg Met 



GCT GGC GGT GCT

1536 
Ala Gly Gly Ala 
500 

GTC TAT GGC TAC

1557 
Val Tyr Gly Tyr 
515 



GAA GTG AAA TAC TTC
Glu Val Lys 



Tyr Phe 
505 



Lys 

ATT 

Ile
490 

CGC 

Arg 



AGC CAT AAC CAG CAT TCT
Ser His



475



Asn Gln His Ser
480 



TAC AAC ACT TAC TAT AAA

Tyr Asn Thr Tyr Tyr Lys

495 

CCT TAT AG^L GTG TAT TGG 
Pro Tyr Ser Val Tyr Trp



GCC TTC TAA 
Ala Phe 



(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 518 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Met Thr Gln Ser Gln Lys Val Arg Phe Leu Ala Pro Leu Ser Leu Ala
1 5 10 15

Leu Ser Leu Ser Phe Asn Pro Val Gly Ala Glu Glu Asp Gly Gly Phe
20 25 30

Met Thr Phe Gly Tyr Glu Leu Gly Gln Val Val Gln Gln Val Lys Asn
35 40 45

Pro Gly Lys Ile Lys Ala Glu Glu Leu Ala Gly Leu Leu Asn Ser Thr
50 55 60

Thr Thr Asn Asn Thr Asn Ile Asn Ile Ala Gly Thr Gly Gly Asn Val
65 70 75 80

Ala Gly Thr Leu Gly Asn Leu Phe Met Asn Gln Leu Gly Asn Leu Ile
85 90 95

Asp Leu Tyr Pro Thr Leu Asn Thr Ser Asn Ile Thr Gln Cys Gly Thr
100 105 110

Thr Asn Ser Gly Ser Ser Ser Ser Gly Gly Gly Ala Ala Thr Ala Ala
115 120 125

Ala Thr Thr Ser Asn Lys Pro Cys Phe Gln Gly Asn Leu Asp Leu Tyr
130 135 140

Arg Lys Met Val Asp Ser Ile Lys Thr Leu Ser Gln Asn Ile Ser Lys
145 150 155 160

Asn Ile Phe Gln Gly Asn Asn Asn Thr Thr Ser Gln Asn Leu Ser Asn
165 170 175

Gln Leu Ser Glu Leu Asn Thr Ala Ser Val Tyr Leu Thr Tyr Met Asn
180 185 190

Ser Phe Leu Asn Ala Asn Asn Gln Ala Gly Gly Ile Phe Gln Asn Asn
195 200 205

Thr Asn Gln Ala Tyr Gly Asn Gly Val Thr Ala Gln Gln Ile Ala Tyr
210 215 220

Ile Leu Lys Gln Ala Ser Ile Thr Met Gly Pro Ser Gly Asp Ser Gly
225 230 235 240

Ala Ala Ala Ala Phe Leu Asp Ala Ala Leu Ala Gln His Val Phe Asn
245 250 255
Ser Ala Asn Ala Gly Asn Asp Leu Ser Ala Lys Glu Phe Thr Ser Leu
260 265 270

Val Gln Asn Ile Val Asn Asn Ser Gln Asn Ala Leu Thr Leu Ala Asn
275 280 285

Asn Ala Asn Ile Ser Asn Ser Thr Gly Tyr Gln Val Ser Tyr Gly Gly
290 295 300

Asn Ile Asp Gln Ala Arg Ser Thr Gln Leu Leu Asn Asn Thr Thr Asn
305 310 315 320

Thr Leu Ala Lys Val Ser Ala Leu Asn Asn Glu Leu Lys Ala Asn Pro
325 330 335

Trp Leu Gly Asn Phe Ala Ala Gly Asn Ser Ser Gln Val Asn Ala Phe
340 345 350

Asn Gly Phe Ile Thr Lys Ile Gly Tyr Lys Gln Phe Phe Gly Glu Asn
355 360 365

Lys Asn Val Gly Leu Arg Tyr Tyr Gly Phe Phe Ser Tyr Asn Gly Ala
370 375 380

Gly Val Gly Asn Gly Pro Thr Tyr Asn Gln Val Asn Leu Leu Thr Tyr
385 390 395 400

Gly Val Gly Thr Asp Val Leu Tyr Asn Val Phe Ser Arg Ser Phe Gly
405 410 415

Ser Arg Ser Leu Asn Ala Gly Phe Phe Gly Gly Ile Gln Leu Ala Gly
420 425 430

Asp Thr Tyr Ile Ser Thr Leu Arg Asn Ser Ser Gln Leu Ala Ser Arg
435 440 445

Pro Thr Ala Thr Lys Phe Gln Phe Leu Phe Asp Val Gly Leu Arg Met
450 455 460

Asn Phe Gly Ile Leu Lys Lys Asp Leu Lys Ser His Asn Gln His Ser
465 470 475 480

Ile Glu Ile Gly Val Gln Ile Pro Thr Ile Tyr Asn Thr Tyr Tyr Lys
485 490 495

Ala Gly Gly Ala Glu Val Lys Tyr Phe Arg Pro Tyr Ser Val Tyr Trp
500 505 510

Val Tyr Gly Tyr Ala Phe
515

(2) INFORMATION FOR SEQ ID NO: 3:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1557 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: both
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Helicobacter pylori

(vii) IMMEDIATE SOURCE:

(B) CLONE: alpA

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1554

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:



ATG ATA AAA AAG AAT AGA ACG CTG TTT CTT AGT
48

Met Ile Lys Lys Asn Arg Thr Leu Phe Leu Ser
520 525 



CTA GCC CTT TGC GCT 
Ala Leu Cys Ala 



AGC ATA AGT TAT GCC GAA GAT GAT GGA 
96 

Ser Ile Ser Tyr Ala Glu Asp Asp Gly Gly Phe

535 540 545

CAG CTC GGG CAA GTC ATG CAA GAT GTC CAA AAl 
144 

Gln Leu Gly Gln Val Met Gln Asp Val Gln Asn

555 560 

AGC GAC GAA CTC GCC AGA GAG CTT AAC GCT GAT 
192 

Ser Asp Glu Leu Ala Arg Glu Leu Asn Ala Asp 
570 575 

TTA AAC AAC AAC ACC GGA GGC AAC ATC GCA GGG 
240 

Leu Asn Asn Asn Thr Gly Gly Asn Ile Ala Gly
585 590 

TTC TCC CAA TAC CTT TAT TCG CTT TTA GGG GCT 
288 

Phe Ser Gln Tyr Leu Tyr Ser Leu Leu Gly Ala
600 605 

AAT GGT AGC GAT GTG TCT GCG AAC GCT CTT TTA 
336 

Asn Gly Ser Asp Val Ser Ala Asn Ala Leu Leu 

615 620 625 



ACC GTC 
Thr Val 



GGT TAT 

Gly Tyr 
550 




GGC GGC GCT AAA 

Gly Gly Ala Lys 
565 

AAC ATT 
Asn Ile

AAC GCT 
Asn Ala 
TCT GGG

384
Ser Gly



ACT CAA 

432 
Thr Gln



ACT GAC 

480 
Thr Asp 



AAC ACC 

528 
Asn Thr 
680 

GGG AAT 

576 
Gly Asn 
695 

GGA ACT 

624 
Gly Thr 



CAA ACT 

672 
Gln Thr



ATC TTA 

720 
Ile Leu



GAA GCT 

768 
Glu Ala 
760 

AAT TCA 

816 
Asn Ser 
775 

CAA GCG 

864 
Gln Ala



ACT TGT GCG GCT 
Thr Cys 



ACG GCT GGT GGC 

Thr Ala Gly Gly 
640 



ACT TCT 
Thr Ser 



GGC TAT TAC TGG CTC CCT 
Tyr Trp 



Gly Tyr 
655 

GGC AGC 
Gly Ser 



Leu Pro 
660 



CAG ACT AAC TAC 

Gln Thr Asn Tyr
675 

CTC ACC TAC TTG 

Leu Thr Tyr Leu 
690 



CTT AAC 

Leu Asn 
645 

AGC TTG 
Ser Leu 

GGC ACG 
Gly Thr 

AAT GCG 
Asn Ala 




GAG AAT AAG AAT 

Glu Asn Lys Asn 
710 



GAT GGT 

Asp Gly 
725 



GCC GCT TTT ACA GGT TTG 

Gly Leu 



Ala Ala Phe Thr 
780 

GTT TAT AAC GAG 

Val Tyr Asn Glu 
795 



GTG CAA GGC ATT 
Val Gln



Gly Ile
785 



CTC ACT AAA AAC 

Leu Thr Lys Asn 
800 



CTC TTA 
Leu Leu 

GAA TTC 
Glu Phe 



ATT GAT CAA TCT
Ile Asp



ACC ATT AGC GGG
Thr Ile



Gln Ser
790 



;T GCG 



Ser Gly Ser Ala
805 
GTT ATT AGC GCT GGG ATA AAC TCC AAC CAA GCT AAC GCT GTG CAA GGG
912

Val Ile Ser Ala Gly Ile Asn Ser Asn Gln Ala Asn Ala Val Gln Gly
810 815 820

CGC GCT AGT CAG CTC CCT AAC GCT CTT TAT AAC GCG CAA GTA ACT TTG 
960

Arg Ala Ser Gln Leu Pro Asn Ala Leu Tyr Asn Ala Gln Val Thr Leu

825 830 835

GAT AAA ATC AAT GCG CTC AAT AAT CAA GTG AGA AGC ATG CCT TAC TTG
1008

Asp Lys Ile Asn Ala Leu Asn Asn Gln Val Arg Ser Met Pro Tyr Leu
840 845 850

CCC CAA TTC AGA GCC GGG AAC AGC CGT TCA ACG AAT ATT TTA AAC GGG
1056

Pro Gln Phe Arg Ala Gly Asn Ser Arg Ser Thr Asn Ile Leu Asn Gly

855 860 865 870

TTT TAC ACC AAA ATA GGC TAT AAG CAA TTC TTC GGG AAG AAA AGG AAT 

1104

Phe Tyr Thr Lys Ile Gly Tyr Lys Gln Phe Phe Gly Lys Lys Arg Asn

875 880 885

ATC GGT TTG CGC TAT TAT GGT TTC TTT TCT TAT AAC GGA GCG AGC GTG

1152

Ile Gly Leu Arg Tyr Tyr Gly Phe Phe Ser Tyr Asn Gly Ala Ser Val
890 895 900

GGC TTT AGA TCC ACT CAA AAT AAT GTA GGG TTA TAC ACT TAT GGG GTG

1200

Gly Phe Arg Ser Thr Gln Asn Asn Val Gly Leu Tyr Thr Tyr Gly Val
905 910 915

GGG ACT GAT GTG TTG TAT AAC ATC TTT AGC CGC TCC TAT CAA AAC CGC 
1248 

Gly Thr Asp Val Leu Tyr Asn Ile Phe Ser Arg Ser Tyr Gln Asn Arg

920 925 930

TCT GTG GAT ATG GGC TTT TTT AGC GGT ATC CaK TTA GCC GGT GAG ACC 
1296 

Ser Val Asp Met Gly Phe Phe Ser Gly Ile Gln Leu Ala Gly Glu Thr
935 940 945 950

TTC CAA TCC ACG CTC AGA GAT GAC CCC AAT GTG AAA I^TG CAT GGG AAA 
1344 

Phe Gln Ser Thr Leu Arg Asp Asp Pro Asn Val Lys Leu His Gly Lys

955 960 965

ATC AAT AAC ACG CAC TTC CAG TTC CTC TTT GAC TTC GGT ATG AGG ATG
1392

Ile Asn Asn Thr His Phe Gln Phe Leu Phe Asp Phe Gly Met Arg Met

970 975 980
AAC TTC GGT AAG TTG GAC GGG AAA TCC AAC CGC CAC AAC




Lys Leu Asp Gly Lys Ser Asn 

990 



GGC GTA GTG GTG CCT ACG ATT TAT AAC ACT 



Tyr Asn Thr 
1010 



1440
Asn Phe Gly
985

GTG GAA TTT

1488
Val Glu Phe
1000 

TCA GCA GGG 

1536 
Ser Ala Gly 
1015 

TCT TAT GGG 

1557 
Ser Tyr Gly 



(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 518 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:



Arg His Asn 
995 



Gly Val Val Val Pro Thr Ile
1005 

ACT ACC GTG AAG TAT TTC CGT

Thr Thr Val Lys Tyr Phe Arg
1020

TAT TCA TTC TAA 

Tyr Ser Phe 
1035 
CAG CAC ACG
Gln His Thr

TAT TAC AAA 
Tyr Tyr Lys 



CCT TAT AGC GTT TAT TGG 



Pro Tyr Ser 
1025 



Val Tyr Trp 
1030 




Met Ile Lys Lys Asn Arg Thr Leu Phe Leu Ser Leu Ala Leu Cys Ala
1 5 10 15

Ser Ile Ser Tyr Ala Glu Asp Asp Gly Gly Phe Phe Thr Val Gly Tyr
20 25 30

Gln Leu Gly Gln Val Met Gln Asp Val Gln Asn Pro Gly Gly Ala Lys
35 40 45

Ser Asp Glu Leu Ala Arg Glu Leu Asn Ala Asp
50 55

Asn Asn Ile

Leu Asn Asn Asn Thr Gly Gly Asn Ile Ala Gly Ala Leu Ser Asn Ala
65 70 75 80

Phe Ser Gln Tyr Leu Tyr Ser Leu Leu Gly Ala Tyr Pro Thr Lys Leu
85 90 95

Asn Gly Ser Asp Val Ser Ala Asn Ala Leu Leu
100 105

Ser Gly Ala Val
110

Ser Gly Thr Cys Ala Ala Ala Gly Thr Ala Gly Gly Thr Ser Leu Asn
115 120 125
125 
Thr Gln Ser Thr Cys Thr Val Ala Gly Tyr Tyr Trp Leu Pro Ser Leu
130 135 140 

Thr Asp Arg Ile Leu Ser Thr Ile Gly Ser Gln Thr Asn Tyr Gly Thr
145 150 155 160

Asn Thr Asn Phe Pro Asn Met Gln Gln Gln Leu Thr Tyr Leu Asn Ala
165 170 175

Gly Asn Val Phe Phe Asn Ala Met Asn Lys Ala Leu Glu Asn Lys Asn
180 185 190

Gly Thr Ser Ser Ala Ser Gly Thr Ser Gly Ala Thr Gly Ser Asp Gly
195 200 205

Gln Thr Tyr Ser Thr Gln Ala Ile Gln Tyr Leu Gln Gly Gln Gln Asn
210 215 220

Ile Leu Asn Asn Ala Ala Asn Leu Leu Lys Gln Asp Glu Leu Leu Leu
225 230 235 240

Glu Ala Phe Asn Ser Ala Val Ala Ala Asn Ile Gly Asn Lys Glu Phe
245 250 255

Asn Ser Ala Ala Phe Thr Gly Leu Val Gln Gly Ile Ile Asp Gln Ser
260 265 270

Gln Ala Val Tyr Asn Glu Leu Thr Lys Asn Thr Ile Ser Gly Ser Ala
275 280 285

Val Ile Ser Ala Gly Ile Asn Ser Asn Gln Ala Asn Ala Val Gln Gly
290 295 300

Arg Ala Ser Gln Leu Pro Asn Ala Leu Tyr Asn Ala Gln Val Thr Leu
305 310 315 320

Asp Lys Ile Asn Ala Leu Asn Asn Gln Val Arg Ser Met Pro Tyr Leu
325 330 335

Pro Gln Phe Arg Ala Gly Asn Ser Arg Ser Thr Asn Ile Leu Asn Gly
340 345 350

Phe Tyr Thr Lys Ile Gly Tyr Lys Gln Phe Phe Gly Lys Lys Arg Asn
355 360 365

Ile Gly Leu Arg Tyr Tyr Gly Phe Phe Ser Tyr Asn Gly Ala Ser Val
370 375 380

Gly Phe Arg Ser Thr Gln Asn Asn Val Gly Leu Tyr Thr Tyr Gly Val
385 390 395 400

Gly Thr Asp Val Leu Tyr Asn Ile Phe Ser Arg Ser Tyr Gln Asn Arg
405 410 415



Ser Val Asp Met Gly Phe Phe Ser Gly Ile Gln Leu Ala Gly Glu Thr
420 425 430

Phe Gln Ser Thr Leu Arg Asp Asp Pro Asn Val Lys Leu His Gly Lys
435 440 445

Ile Asn Asn Thr His Phe Gln Phe Leu Phe Asp Phe Gly Met Arg Met
450 455 460

Asn Phe Gly Lys Leu Asp Gly Lys Ser Asn Arg His Asn Gln His Thr
465 470 475 480

Val Glu Phe Gly Val Val Val Pro Thr Ile Tyr Asn Thr Tyr Tyr Lys
485 490 495

Ser Ala Gly Thr Thr Val Lys Tyr Phe Arg Pro Tyr Ser Val Tyr Trp
500 505 510

Ser Tyr Gly Tyr Ser Phe
515

(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 656 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: both



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 567..656



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

AGATCTATGA ATCTATGATA TCAACACTCT TTTTGATAAA TTTTCTCGAG GTACCGAGCT 
60

TGAGGCATCA AATAAAACGA AAGGCTCAGT CGAAAGACTG GGCCTTTCGT TTTATCTGTT 

GTTTGTCGGT GAACGCTCTC CTGAGTAGGA CAAATCCCeC GGGAGCGGAT TTGAACGTTG 
180 

CGAAGCAACG GCCCGGAGGG TGGCGGGCAG GACGCCCGCC ^AAACTGCC ACAAGCTCGG
240 



TACCGTTGAT CTTCCTATGG TGCACTCTCA GTACAATCTG CTCkSATGCG CTACGTGACT 
300 

GGGTCATGGC TGCGCCCCGA CACCCGCCAA CACCCGCTGA CGCGCC^GA CGGGCTTGTC 
TGCTCCCGGC ATCCGCTTAC AGACAAGCTG TGACCGTCTC CGGGAGCTGC ATGTGTCAGA
420

GGTTTTCACC GTCATCACCG AAACGCGCGA GGCCCAGCGC TTCGAACTTC TGATAGACTT

480 

CGAAATTAAT ACGACTCACT ATAGGGAGAC CACAACGGTT TCCCTCTAGA AATAATTTTG

540 



TTTAACTTTA AGAAGGAGAT ATACAT ATG AAA CTG ACT CCC AAA GAG TTA GAC
593

Met Lys Leu Thr Pro Lys Glu Leu Asp
520 525



AAG TTG ATG CTC CAC TA^ GCT GGA GAA TTG GCT AAA AAA CGC AAA GAA
641

Lys Leu Met Leu His Tyr Ala Gly Glu Leu Ala Lys Lys Arg Lys Glu
530 535 540

AAA GGC ATT AAG CTT 
656 

Lys Gly Ile Lys Leu
545 



(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

Met Lys Leu Thr Pro Lys Glu Leu Asp Lys Leu Met Leu His Tyr Ala
1 5 10 15

Gly Glu Leu Ala Lys Lys Arg Lys Glu Lys Gly Ile Lys Leu
20 25 30



